The reconstruction of interaction vertices can be decomposed into a pattern recognition problem ("vertex finding") and a statistical problem ("vertex fitting"). We briefly review classical methods, introduce novel approaches, and motivate them in the context of high-luminosity experiments such as those at the LHC. We then compare them with the classical methods in relevant physics channels.



1. INTRODUCTION 

Vertex reconstruction algorithms face new challenges in high-luminosity scenarios such as the LHC experiments. Vertex finding algorithms have to be able to disentangle the tracks of vertices in difficult topologies, such as decay vertices that are very close to the primary vertex or decay chains with very small separations between the vertices. Vertex fitters will need to be robustified, since outliers and non-Gaussian tails in the error distributions of the track parameters will occur frequently.

We pursue extensive studies of vertex reconstruction algorithms that are capable of dealing with ambiguities and track mis-reconstructions. Section 2 discusses robustifications of vertex fitting algorithms. Section 3 presents novel approaches to the vertex finding problem, derived from the clustering literature.



2. VERTEX FITTING 

Robustified vertex fitting has already been dis- 
cussed in p|; we shall only briefly review this topic 
here. 

The classical methods in this field are least-squares (LS) methods. The breakdown point of LS estimators is zero, which means that even a single outlier track can bias the resulting fit significantly. For noisy environments such as the LHC experiments, robustifications of the classical LS methods were therefore investigated. We suggest three new methods:

• Adaptive method: Instead of minimizing the plain sum of squared residuals we minimize a weighted sum of squared residuals:

$$\hat{x}_{\mathrm{Adaptive}} = \operatorname*{argmin}_{x} \sum_{i=1}^{n} w_i \cdot r_i^2(x)$$

Outliers are not discarded but downweighted according to the weight function

$$w_i = \frac{e^{-\beta r_i^2}}{e^{-\beta r_i^2} + e^{-\beta r_c^2}} \qquad (1)$$

Here $r_c$ denotes a cutoff parameter, while $\beta = 1/(2T)$ introduces a temperature $T$ that is reduced in each iteration step according to a well-defined annealing schedule. An iterative weighted LS procedure is used to find this minimum.

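• Trimming method: We minimize the sum of only a user-defined fraction of the squared residuals:

$$\hat{x}_{\mathrm{Trimming}} = \operatorname*{argmin}_{x} \sum_{i=1}^{h} r_i^2(x)$$

where the sum runs over the $h$ smallest squared residuals, $h$ being fixed by the user-defined fraction of the $n$ tracks. A fast method that finds this minimum is described in [3].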

• LMS: We minimize the median of the squared residuals:

$$\hat{x}_{\mathrm{LMS}} = \operatorname*{argmin}_{x} \, \operatorname*{med}_{i} \, r_i^2(x)$$

So far, only a simplified algorithm has been found that is compatible with our CPU constraints. It works separately on each coordinate of the points of closest approach of the tracks with respect to a vertex candidate, and thus ignores the spatial structure of the data.
A full 3D method that works within our CPU requirements is still being sought.

The conclusions that we draw are as follows [4]:

• The adaptive method should be considered a good default method; it deals with a great many different situations in an optimal or nearly optimal way. It leaves clean vertices almost unaffected, while at the same time being a very robust algorithm.

• The trimming vertex fitter may be of interest if the number of outliers is known in advance. In any other situation it is inferior to the adaptive method.

• The LS fit is the fastest 3D fit, and optimal for extremely pure data.

• The coordinate-wise LMS fit is the fastest method, but it is very imprecise (a few hundred microns, compared to a few tens of microns, for primary vertices). It can nevertheless be used to provide a first guess of the vertex position.
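The adaptive method described in this section can be illustrated with a small Python sketch. It is a toy under stated assumptions, not the CMS implementation: tracks are replaced by 3D points (a hypothetical simplification), the residual is the Euclidean distance to the vertex candidate divided by an assumed common resolution `sigma`, and the weight is assumed to have the form $w = 1/(1 + e^{\beta (r^2 - r_c^2)})$ with $\beta = 1/(2T)$, which is the form commonly used for adaptive fits. The function names and the temperature values of the annealing schedule are illustrative choices.

```python
import math

def adaptive_weight(r2, r2_cut, temperature):
    """Weight in [0, 1]: near 1 for r2 << r2_cut, near 0 for r2 >> r2_cut."""
    beta = 1.0 / (2.0 * temperature)
    # Equal to exp(-beta*r2) / (exp(-beta*r2) + exp(-beta*r2_cut)),
    # rewritten so that very large residuals cannot produce 0/0.
    arg = beta * (r2 - r2_cut)
    if arg > 700.0:  # exp() would overflow; the weight is numerically zero
        return 0.0
    return 1.0 / (1.0 + math.exp(arg))

def weighted_mean(points, weights):
    """Weighted LS estimate of the vertex for point-like 'tracks'."""
    total = sum(weights)
    if total == 0.0:
        raise ValueError("all weights vanished")
    return tuple(
        sum(w * p[k] for p, w in zip(points, weights)) / total for k in range(3)
    )

def adaptive_fit(points, sigma=1.0, r_cut=3.0,
                 temperatures=(256.0, 64.0, 16.0, 4.0, 1.0)):
    """Iteratively reweighted LS with an (assumed) geometric annealing schedule."""
    weights = [1.0] * len(points)
    vertex = weighted_mean(points, weights)      # plain LS estimate as start
    for temperature in temperatures:             # lower T in every step
        for i, p in enumerate(points):
            r2 = sum((p[k] - vertex[k]) ** 2 for k in range(3)) / sigma ** 2
            weights[i] = adaptive_weight(r2, r_cut ** 2, temperature)
        vertex = weighted_mean(points, weights)  # one weighted LS step
    return vertex, weights

# Five compatible "tracks" around the origin and one gross outlier.
tracks = [(0.1, 0.0, 0.0), (-0.1, 0.0, 0.0), (0.0, 0.1, 0.0),
          (0.0, -0.1, 0.0), (0.0, 0.0, 0.0), (50.0, 0.0, 0.0)]
vertex, weights = adaptive_fit(tracks)
print(vertex)
print(weights)
```

In this toy example the plain LS start value is pulled far towards the outlier, while the annealed fit ends close to the origin with the outlier weight driven to zero.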



3. VERTEX FINDING 

We categorise the set of vertex finding algorithms into hierarchic and non-hierarchic methods. Hierarchic methods are algorithms whose workings can be visualised with a dendrogram. They can further be split into divisive and agglomerative methods.

Divisive methods start with one cluster that contains all tracks; in each iteration a certain subset of tracks is split off from the cluster into its own cluster, which may in turn itself be split into sub-clusters. The procedure stops once a certain formal criterion is met.

Agglomerative methods start by assigning a singleton cluster to every single track. The most compatible clusters are then merged in every iteration step. Again the procedure is stopped when a formal condition is met. The most decisive factor in these methods is the metric that is employed to compute the compatibility between two clusters.

Let $\alpha$ and $\beta$ denote two clusters. Let further $s$ be the set of all minimum distances between track pairs with one track in cluster $\alpha$ and the other in cluster $\beta$. We can now choose as the metric e.g.:

$$d(\alpha,\beta) = \min(s),\ \max(s),\ \bar{s},\ \operatorname{median}(s),\ \ldots \qquad (2)$$

The choice $d(\alpha,\beta) = \min(s)$ implements a single linkage or minimum spanning tree procedure, whereas $d(\alpha,\beta) = \max(s)$ is often referred to as complete linkage.

The following theorem significantly reduces the 
number of reasonable choices: 




Figure 1: Schematic description of how the triangle 
inequality is violated in the track clustering problem. 



Theorem: The triangle inequality does not generally hold for the minimum distances between a set of $n$ tracks.

Proof: Let $A$, $B$, $C$ denote three tracks. Let $A$ and $B$ share one common vertex $V_1$; let further $B$ and $C$ also share one common vertex $V_2$. Then:

$$\overline{AB} = \epsilon,\quad \overline{BC} = \epsilon,\quad \overline{AC} = d \gg \epsilon \;\Rightarrow\; \overline{AB} + \overline{BC} < \overline{AC} \qquad \text{q.e.d.} \qquad (3)$$

This means that the choice $d(\alpha,\beta) = \min(s)$ would cluster $A$, $B$ and $C$ into a single vertex. We can therefore safely discard single linkage from the list of promising algorithms.

Until now the best results were obtained with the choice $d(\alpha,\beta) = \max(s)$, i.e. with a complete linkage procedure.
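The effect of the linkage choice on the three-track configuration of the proof can be reproduced with a short Python sketch. It is a minimal illustration under assumptions: the pairwise minimum distances are given directly as a matrix, the stopping condition is a simple distance cutoff, and the function name `agglomerate` and the numerical values are hypothetical.

```python
def agglomerate(dist, cutoff, linkage=max):
    """Agglomerative clustering on a symmetric matrix of pairwise distances.

    linkage=min gives single linkage, linkage=max gives complete linkage.
    Merging stops when the two closest clusters are farther apart than cutoff.
    """
    clusters = [[i] for i in range(len(dist))]  # one singleton per track

    def cluster_distance(a, b):
        return linkage(dist[i][j] for i in a for j in b)

    while len(clusters) > 1:
        d_best, ia, ib = min(
            (cluster_distance(clusters[a], clusters[b]), a, b)
            for a in range(len(clusters))
            for b in range(a + 1, len(clusters))
        )
        if d_best > cutoff:
            break
        clusters[ia].extend(clusters[ib])  # merge the most compatible pair
        del clusters[ib]
    return clusters

# Tracks A, B, C (indices 0, 1, 2): AB = BC = 0.01, AC = 1.0.
eps, d = 0.01, 1.0
distances = [[0.0, eps, d],
             [eps, 0.0, eps],
             [d, eps, 0.0]]
print(agglomerate(distances, cutoff=0.1, linkage=min))  # single linkage
print(agglomerate(distances, cutoff=0.1, linkage=max))  # complete linkage
```

With these numbers single linkage chains all three tracks into one cluster, whereas complete linkage stops after merging the first compatible pair.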

An alternative to the above metrics is of course to fit a vertex for each cluster with more than one track, and to use these vertices as "representatives" of the clusters.

3.1. Finding-Thru-Fitting 

The most mature algorithm in CMS is the "principal vertex reconstructor" (PVR), also known as the "finding-thru-fitting" method. It is a divisive method that internally uses a fitter and a track-to-vertex compatibility estimator to decide which tracks are to be discarded at each iteration step. The maturity of the implementation and the algorithmic simplicity make it an ideal baseline for performance evaluation.
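The divisive idea can be sketched in a few lines of Python. This is only a toy stand-in for the real reconstructor, under loudly stated assumptions: tracks are replaced by 3D points, the "fitter" is the arithmetic mean, the "compatibility estimator" is a plain distance cut `max_dist`, and the least compatible point is discarded one at a time. All names are hypothetical.

```python
import math

def fit(points):
    """Toy 'vertex fit': arithmetic mean of the points."""
    return tuple(sum(c) / len(points) for c in zip(*points))

def find_thru_fit(points, max_dist=1.0, min_tracks=2):
    """Divisive finder: fit, discard the least compatible track, refit.

    The discarded tracks are then searched for further vertices.
    Returns a list of (vertex, tracks) pairs.
    """
    vertices = []
    remaining = list(points)
    while len(remaining) >= min_tracks:
        cluster, discarded = list(remaining), []
        while len(cluster) >= min_tracks:
            vertex = fit(cluster)
            worst = max(cluster, key=lambda p: math.dist(p, vertex))
            if math.dist(worst, vertex) <= max_dist:
                break                      # every track is compatible
            cluster.remove(worst)          # discard the least compatible track
            discarded.append(worst)
        if len(cluster) < min_tracks:
            break                          # no further vertex can be formed
        vertices.append((fit(cluster), cluster))
        remaining = discarded              # look for vertices in the rest
    return vertices

tracks = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (-0.1, 0.0, 0.0), (0.0, 0.1, 0.0),
          (5.0, 0.0, 0.0), (5.2, 0.0, 0.0)]
for vertex, members in find_thru_fit(tracks):
    print(vertex, len(members))
```

In this example the first pass isolates the four points near the origin, and the two discarded points form a second vertex in the next pass.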

3.2. Apex points 

In order to overcome the topological problems described in section 2, we conceived another approach: the apex point formalism. The main concept is that the tracks are substituted by representative points, the apex points. These points should fully represent the tracks with respect to the vertex finding problem at hand. The space in which the apex points are defined can have any dimension; it must only be equipped with a proper metric that fulfils the triangle inequality. Our current implementation produces three-dimensional points in a Euclidean space, together with a 3x3 error matrix. Note that the apex-point-to-track mapping need not be unique; it may very well be necessary that $n$ apex points, $n > 1$, represent one track.

One can of course also formulate hierarchic cluster- 
ing methods on top of the apex points. 

3.3. Apex point finders 

An algorithm that searches for such representative "apex" points is called an apex point finder. Since these finders operate on the points of closest approach, they can be formulated as generic pattern recognition problems in one dimension (i.e. along the tracks). The set of potential apex point finding algorithms is thus huge; a systematic effort to choose algorithms that satisfy our needs is ongoing. So far we have only investigated a few simple methods:

• The HSM (half sample mode) finder iteratively calls an LMS estimator on the points of closest approach.

• The MTV (minimal two values) finder looks for the two adjacent points of closest approach whose sum of distances to their counterparts on the other tracks is minimal.

• The MAMF (minimum area mode) finder looks for the two adjacent points whose sum of distances to their counterparts, multiplied by the distance between them, is minimal.

Future research will try more sophisticated algorithms such as a deterministic annealing method or a gravitational clusterer.
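The half sample mode idea mentioned for the HSM finder can be sketched in one dimension as follows. This is a minimal sketch, assuming that the points of closest approach have already been projected onto a coordinate along the track; the function name and the stopping rule (stop when at most two points remain) are illustrative assumptions, not the CMS implementation.

```python
def half_sample_mode(values):
    """Estimate the mode of 1D data by repeatedly keeping the densest half.

    In each step the shortest interval that contains half of the remaining
    points is selected; the estimate is the mean of the last one or two points.
    """
    xs = sorted(values)
    if not xs:
        raise ValueError("need at least one value")
    while len(xs) > 2:
        h = (len(xs) + 1) // 2  # size of the half sample
        start = min(range(len(xs) - h + 1),
                    key=lambda i: xs[i + h - 1] - xs[i])
        xs = xs[start:start + h]
    return sum(xs) / len(xs)

# Positions along a track: a dense group near 0.1 plus two far outliers.
positions = [0.0, 0.1, 0.2, 0.15, 5.0, 9.0, 0.12]
print(half_sample_mode(positions))
```

The two outliers never enter a selected half, so the estimate stays inside the dense group of points.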

3.4. Global association criterion 

The weights that have been introduced for the adaptive fitting method (see section 2) can also be used to define a global "plausibility" criterion for the result of a vertex reconstructor. With $m$ being the total number of tracks and $n$ being the number of vertices we define the global association criterion (GAC) by:

$$\mathrm{GAC} = \frac{1}{m \cdot n} \sum_{i=1}^{m} \sum_{j=1}^{n} p_{ij}$$

where

$$p_{ij} = \begin{cases} 1 - w_{ij} & \text{if } i \in j \\ w_{ij} & \text{otherwise} \end{cases}$$

and $w_{ij}$ is the weight $w_i$ of track $i$ with respect to vertex $j$.

Figure 2: Two apex point finders at work.

The most important open question with respect to this criterion is how it relates to the "Minimum Message Length". Can the information-theoretic limit of the vertex finding task be formulated in terms of the GAC?
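A global association criterion of this kind can be sketched in Python as follows. The sketch assumes that the criterion is the average, over all track-vertex pairs, of a penalty that equals 1 - w for a track assigned to a vertex and w otherwise; the normalisation by the number of pairs, the function name and the input layout are assumptions made for illustration.

```python
def global_association_criterion(weights, assignment):
    """Average penalty over all (track, vertex) pairs; smaller is better.

    weights[i][j] -- weight of track i with respect to vertex j, in [0, 1]
    assignment[i] -- index of the vertex that track i is assigned to,
                     or None if the track is not assigned to any vertex
    """
    m, n = len(weights), len(weights[0])
    total = 0.0
    for i in range(m):
        for j in range(n):
            w = weights[i][j]
            # An assigned track should have a high weight for its vertex,
            # an unassigned pair should have a low weight.
            total += (1.0 - w) if assignment[i] == j else w
    return total / (m * n)

weights = [[0.9, 0.1],
           [0.8, 0.0],
           [0.2, 0.7]]
print(global_association_criterion(weights, [0, 0, 1]))  # plausible solution
print(global_association_criterion(weights, [1, 1, 0]))  # implausible solution
```

Under these assumptions the first assignment, which follows the large weights, yields the smaller value, which is the behaviour needed to compare competing solutions.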

The potential uses of such a criterion are manifold:

• Exhaustive vertex finding algorithm. All combinations of track clusters could, at least in principle, be iterated through; one would then choose the combination with the smallest GAC.

• Stopping condition. The GAC could also serve as a stopping condition in a wide range of algorithms.

• Super finder algorithms. The GAC could also be used to resolve ambiguities: more than one vertex reconstructor could be run on the same event, and the GAC could then decide on the "better" solution.

Clearly, more research in this direction is needed.

TULT013

CHEP03, La Jolla, California, March 24-28, 2003

3.5. Learning algorithms

"Learning algorithms" can easily be formulated on top of the apex points; some can also work on the distance matrix itself. Good candidates for such algorithms are:

• Vector quantisation or the k-means algorithm, which have dynamic vertex candidates ("prototypes") that are attracted by the apex points.

• Potts neurons or the super-paramagnetic clustering algorithm; these algorithms attribute a spin state or a mathematical equivalent to every apex point. Spin-spin correlations together with an annealing schedule will then make sure that similar apex points are described by the same spin vector.

• Deterministic annealing; this method formulates the clustering problem as a thermodynamic system with phase transitions, each transition introducing a new separate cluster of apex points.
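The first of these candidates, k-means, can be sketched on three-dimensional apex points as follows. This is a plain textbook k-means in pure Python, shown only to illustrate how prototypes are attracted by the points; the function names, the fixed initial prototypes and the neglect of the apex point error matrices are simplifying assumptions.

```python
def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))

def centroid(points):
    return tuple(sum(c) / len(points) for c in zip(*points))

def k_means(points, prototypes, max_iterations=50):
    """Move each prototype to the centroid of the points nearest to it."""
    prototypes = [tuple(p) for p in prototypes]
    groups = [[] for _ in prototypes]
    for _ in range(max_iterations):
        groups = [[] for _ in prototypes]
        for p in points:
            nearest = min(range(len(prototypes)),
                          key=lambda k: squared_distance(p, prototypes[k]))
            groups[nearest].append(p)
        # A prototype without points keeps its previous position.
        updated = [centroid(g) if g else prototypes[k]
                   for k, g in enumerate(groups)]
        if updated == prototypes:
            break  # converged: the assignment no longer changes
        prototypes = updated
    return prototypes, groups

apex_points = [(0.0, 0.0, 0.0), (0.2, 0.0, 0.0), (-0.2, 0.0, 0.0),
               (5.0, 0.0, 0.0), (5.2, 0.0, 0.0), (4.8, 0.0, 0.0)]
prototypes, groups = k_means(apex_points, [(1.0, 0.0, 0.0), (4.0, 0.0, 0.0)])
print(prototypes)
```

Note that the number of prototypes has to be chosen in advance here, which is one reason why a stopping or plausibility criterion is of interest for vertex finding.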

3.6. Simulation experiment 

We compared one of our algorithms with two standard vertex finding procedures: the PVR (see section 3.1) and the so-called DOPhi method, a special-purpose algorithm based on the impact parameters of the tracks at the beamline. As the novel method to compare with, we chose an agglomerative clusterer with vertex fits as "representative points", as explained in the last paragraph of section 3. Our testing was done with 1000 Monte Carlo 50 GeV $b\bar{b}$ events, generated with the CMSIM simulation program. Before the actual comparison all algorithms were automatically fine-tuned to maximize the following "score function":

Score = 10 • Eff Prim • Eff So c ■ Pur ,^ • Pur° c 2 c 5 • 
AssEff£ r 2 4 • AssEff^ 25 • (1 - Fake) 5 

"Eff", "Pur", "AssEff", and "Fake" denote the performance estimators described in [1]; "Prim" denotes primary vertices, "Sec" stands for secondary vertices.

3.7. Results 

In the inclusive secondary vertex finding scenario, the agglomerative fitting procedure finds up to 80 percent of the secondary vertices, as opposed to the 50-60 percent found by older algorithms. Note that the DOPhi algorithm is not intended to find any primary vertices, hence its total score parameter is meaningless. See figure 3 for the complete comparison.



4. CONCLUSIONS 

Figure 3: Analysis of performance: one novel agglomerative finder compared to older vertex finding algorithms. The agglomerative method has a much higher secondary vertex finding efficiency, while it reports about the same fake rate.

We have reached a good understanding of the robustification methods of the classical LS vertex fitters. We suggest that the adaptive method be used as the new default fitting procedure for CMS and possibly other experiments as well. Admittedly, we still lack such an exhaustive understanding of the much more complex task of vertex finding, although here we were able to exclude certain classes of algorithms on the basis of purely theoretical considerations. Our first results are most promising; we can quite clearly demonstrate that, with respect to e.g. secondary vertex finding, classical methods such as the "finding-thru-fitting" algorithm can be surpassed by far. Future research will mainly focus on three areas:

• apex point finding algorithms, 

• "learning" algorithms, 

• the global association criterion and its implications.